2022 iThome 鐵人賽
DAY 18
Besides yesterday's new LSTM and GRU and the original GlobalAveragePooling1D, let's also bring in Flatten and the Convolution layers we learned earlier and compare them all. This time we switch to the IMDB dataset without subwords and build the vocabulary ourselves: the vocabulary size is set to 10,000 and the sentence length to 120. We'll start with the simple models.
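The dataset setup described above might look like the following sketch. The variable names (`train_sentences`, `vocab_size`, `max_length`) are assumptions standing in for the series' earlier preprocessing code, and it uses the TF 2.x `Tokenizer` API:

```python
# A minimal sketch of the assumed preprocessing: build our own word index
# (no subwords) with the TF 2.x Tokenizer, then pad every review to 120 tokens.
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

vocab_size = 10000   # keep only the 10,000 most frequent words
max_length = 120     # truncate/pad every review to 120 tokens

# Placeholder reviews; the real code would load the IMDB plain-text dataset.
train_sentences = ["the movie was great", "the movie was terrible"]

tokenizer = Tokenizer(num_words=vocab_size, oov_token="<OOV>")
tokenizer.fit_on_texts(train_sentences)

sequences = tokenizer.texts_to_sequences(train_sentences)
padded = pad_sequences(sequences, maxlen=max_length, truncating='post')
print(padded.shape)  # (2, 120)
```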

Flatten:

# Parameters
embedding_dim = 16
dense_dim = 6

# Model Definition with a Flatten layer
model_flatten = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(dense_dim, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Set the training parameters
model_flatten.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

The model reaches very high accuracy and very low loss right from the start:
https://ithelp.ithome.com.tw/upload/images/20220922/20141158HTp4nOxCGS.png

GlobalAveragePooling1D:

# Parameters
embedding_dim = 16
dense_dim = 6

# Model Definition with a GlobalAveragePooling1D layer
model_pool = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(dense_dim, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Set the training parameters
model_pool.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

The results are as follows:
https://ithelp.ithome.com.tw/upload/images/20220922/20141158yHy2hog2r5.png

Single-layer LSTM:

# Parameters
embedding_dim = 16
lstm_dim = 32
dense_dim = 6

# Model Definition with LSTM
model_lstm = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(lstm_dim)),
    tf.keras.layers.Dense(dense_dim, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Set the training parameters
model_lstm.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

One epoch takes about 4 seconds on Colab. The results are as follows:
https://ithelp.ithome.com.tw/upload/images/20220922/20141158gAV0wf4Lcj.png

Two-layer LSTM:

# Parameters
embedding_dim = 16
lstm_dim1 = 64
lstm_dim2 = 32
dense_dim = 6

# Model Definition with LSTM
model_lstm2 = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(lstm_dim1, return_sequences=True)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(lstm_dim2)),
    tf.keras.layers.Dense(dense_dim, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Set the training parameters
model_lstm2.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

One epoch takes about 7 seconds on Colab. The results are as follows:
https://ithelp.ithome.com.tw/upload/images/20220922/20141158Db4r8bR5ix.png

GRU:

# Parameters
embedding_dim = 16
gru_dim = 32
dense_dim = 6

# Model Definition with GRU
model_gru = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.Bidirectional(tf.keras.layers.GRU(gru_dim)),
    tf.keras.layers.Dense(dense_dim, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Set the training parameters
model_gru.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

One epoch takes about 4 seconds on Colab. The results are as follows:
https://ithelp.ithome.com.tw/upload/images/20220922/20141158tNhABk8BNG.png

Conv1D followed by GlobalAveragePooling1D:

# Parameters
embedding_dim = 16
filters = 128
kernel_size = 5
dense_dim = 6

# Model Definition with Conv1D
model_conv = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.Conv1D(filters, kernel_size, activation='relu'),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(dense_dim, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Set the training parameters
model_conv.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

The results are as follows:
https://ithelp.ithome.com.tw/upload/images/20220922/20141158TnP69jFCJl.png

Conv1D followed by Flatten:

# Parameters
embedding_dim = 16
filters = 128
kernel_size = 5
dense_dim = 6

# Model Definition with Conv1D
model_conv_flat = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim, input_length=max_length),
    tf.keras.layers.Conv1D(filters, kernel_size, activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(dense_dim, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])

# Set the training parameters
model_conv_flat.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

The results are as follows; the curves are especially smooth:
https://ithelp.ithome.com.tw/upload/images/20220922/20141158H6V6HOIRyQ.png

Among all of these, only the Flatten model stands out for reaching very high accuracy and very low loss right from the start. As for training time, the LSTM and GRU models take longer as noted above, while every model without an annotation runs at about 1 to 2 seconds per epoch.
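The per-epoch timings above can be reproduced in spirit with a sketch like the following; the toy data and epoch count are assumptions, so the absolute numbers will differ from the Colab runs:

```python
# Hedged sketch: time one of the models above on dummy data. The random
# integer sequences stand in for the padded IMDB reviews.
import time

import numpy as np
import tensorflow as tf

vocab_size, max_length, embedding_dim, dense_dim = 10000, 120, 16, 6

# Dummy integer sequences and binary labels in place of the real dataset.
x = np.random.randint(1, vocab_size, size=(64, max_length))
y = np.random.randint(0, 2, size=(64,))

model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embedding_dim),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(dense_dim, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

epochs = 2
start = time.time()
history = model.fit(x, y, epochs=epochs, verbose=0)
print(f"~{(time.time() - start) / epochs:.2f}s per epoch on this toy batch")
```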

